Obtaining Japanese Lexical Units for Semantic Frames from Berkeley FrameNet Using a Bilingual Corpus
Abstract
This study attempted to semi-automatically obtain Japanese “lexical units” (LUs) from the English LUs defined in the semantic frame database of Berkeley FrameNet (BFN), using an English-Japanese bilingual corpus. The task is a prerequisite to building a complete database of semantic frames for Japanese. A Japanese word is first translated into an English word or phrase, E, where E is one of the lexical units that evoke a particular semantic frame, F, in the BFN database. The other lexical units of F are then translated back into Japanese, which yields a candidate set of Japanese lexical units for F. The viability of the method was tested on the Japanese verb (X-ga Y-wo) osou, a relatively polysemous word roughly meaning “X attack(s) Y,” “X hit(s) Y,” or “X surprise(s) Y” in English. The resulting translations were compared with the semantic descriptions provided by IPAL and Nihongo Goi-Taikei (A Japanese Lexicon), two well-known language resources for Japanese, and by the Frame Oriented Concept Analysis of Language (FOCAL). The comparison revealed that FOCAL, BFN, Goi-Taikei, and IPAL provided finer-grained descriptions, in that order.
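The round-trip procedure described in the abstract can be sketched roughly as follows. This is only a minimal illustration, not the authors' implementation: the three dictionaries are small hand-made stand-ins for the bilingual corpus and the BFN database, the frame labels and romanized Japanese words are toy data chosen for the example, and the function name `candidate_japanese_lus` is hypothetical.

```python
from collections import defaultdict

# Toy, hand-made data for illustration only (assumption): none of these
# entries are taken from BFN or from a real bilingual corpus.
JA_TO_EN = {"osou": ["attack", "hit", "surprise"]}

# Frame label -> English lexical units that evoke it (toy stand-in for BFN).
FRAME_LUS = {
    "Attack": ["attack", "assault", "raid"],
    "Impact": ["hit", "strike"],
}

# English word -> Japanese translations (toy stand-in for the corpus).
EN_TO_JA = {
    "attack": ["osou", "kougeki-suru"],
    "assault": ["shuugeki-suru", "osou"],
    "raid": ["shuugeki-suru"],
    "hit": ["osou", "butsukaru"],
    "strike": ["butsukaru"],
}


def candidate_japanese_lus(ja_word):
    """Return {frame: sorted candidate Japanese LUs} for one Japanese word."""
    candidates = defaultdict(set)
    # Step 1: translate the Japanese word into English words E.
    for en in JA_TO_EN.get(ja_word, []):
        # Step 2: find every frame F that E evokes.
        for frame, lus in FRAME_LUS.items():
            if en not in lus:
                continue
            # Step 3: translate the *other* LUs of F back into Japanese.
            for other in lus:
                if other == en:
                    continue
                candidates[frame].update(EN_TO_JA.get(other, []))
    return {frame: sorted(words) for frame, words in candidates.items()}


print(candidate_japanese_lus("osou"))
```

In this toy setup, "surprise" evokes no frame and so contributes nothing, and a frame's candidate set contains only what the back-translation of its other lexical units produces. A real pipeline would still need manual filtering of the candidates, in keeping with the semi-automatic nature of the method.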
Related resources
How FrameSQL Shows the Japanese FrameNet Data
FrameSQL is a web-based application that the author (Sato, 2003; Sato, 2008) originally created for searching the Berkeley FrameNet lexical database. FrameSQL can now handle the Japanese lexical database built by the Japanese FrameNet project (JFN) of Keio University in Japan. FrameSQL can search and view the JFN data released in March 2009 in a standard web browser. Users do not need to ins...
Building an Italian FrameNet through Semi-automatic Corpus Analysis
In this paper, we outline the methodology we adopted to develop a FrameNet for Italian. The main element of novelty with respect to the original FrameNet is represented by the fact that the creation and annotation of Lexical Units is strictly grounded in distributional information (statistical distribution of verbal subcategorization frames, lexical and semantic preferences of each frame) autom...
Semantic Annotations in Japanese FrameNet: Comparing Frames in Japanese and English
Since 2008, the Japanese FrameNet (JFN, http://jfn.st.hc.keio.ac.jp/) project has been annotating the Balanced Corpus of Contemporary Written Japanese (BCCWJ), the first such corpus, officially released in October 2011. This paper reports annotation results for the book genre of BCCWJ (Ohara, 2011; Ohara, Saito, Fujii & Sato, 2011). Comparing the semantic frames needed to annotate BCCWJ with those...
FrameBank: A Database of Russian Lexical Constructions
Russian FrameBank is a bank of annotated samples from the Russian National Corpus which documents the use of lexical constructions (e.g. argument constructions of verbs and nouns). FrameBank belongs to the FrameNet-oriented resources, but unlike Berkeley FrameNet it focuses more on the morphosyntactic and semantic features of individual lexemes than on generalized frames, following the theor...
Lexicon and Grammar in Bulgarian FrameNet
In this paper, we report on our attempt at assigning semantic information from the English FrameNet to lexical units in the Bulgarian valency lexicon. The paper briefly presents the model underlying the Bulgarian FrameNet (BulFrameNet): each lexical entry consists of a lexical unit; a semantic frame from the English FrameNet, expressing abstract semantic structure; a grammatical class, defining...